Overlaying classifiers: a practical approach to optimal scoring
Authors
Abstract
The ROC curve is one of the most widely used visual tools for evaluating the performance of scoring functions with respect to their capacity to discriminate between two populations. The goal of this paper is to propose a statistical learning method for constructing a scoring function with a nearly optimal ROC curve. In this bipartite setup, the target is known to be the regression function up to an increasing transform, and solving the optimization problem boils down to recovering the collection of level sets of the latter, which we interpret here as a continuum of nested classification problems. We propose a discretization approach, which consists of building a finite sequence of N classifiers by constrained empirical risk minimization and then constructing a piecewise constant scoring function s_N(x) by overlaying the resulting classifiers. Given the functional nature of the ROC criterion, the accuracy of the ranking induced by s_N(x) can be measured in a variety of ways, depending on the distance chosen for measuring closeness to the optimal curve in ROC space. By relating the ROC curve of the resulting scoring function to piecewise linear approximants of the optimal ROC curve, we establish the consistency of the method as well as rate bounds that control its generalization ability in sup-norm. Finally, we highlight the fact that, as a byproduct, the proposed algorithm provides an accurate estimate of the optimal ROC curve.
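The overlaying step described in the abstract can be illustrated with a minimal sketch. It assumes the N classifiers are already fitted and nested, and it takes the overlay to be the count of classifiers that flag a point as positive, which is one natural piecewise constant choice and not necessarily the paper's exact definition. The constrained empirical risk minimization step is not shown; the threshold rules, `overlay`, and `thresholds` are hypothetical stand-ins.

```python
from typing import Callable, Sequence

# A classifier here is any rule mapping an observation to True (positive) or False.
Classifier = Callable[[float], bool]


def overlay(classifiers: Sequence[Classifier]) -> Callable[[float], int]:
    """Build a piecewise constant score: the number of classifiers flagging x.

    Assumption: the positive regions are nested, so the count ranks points
    by the most selective region that contains them.
    """
    def score(x: float) -> int:
        return sum(1 for c in classifiers if c(x))
    return score


# Hypothetical nested classifiers on a 1-D feature: "x >= t" for decreasing t.
# In the method described above these would come from constrained ERM instead.
thresholds = [0.8, 0.5, 0.2]
classifiers = [lambda x, t=t: x >= t for t in thresholds]

s_N = overlay(classifiers)
scores = [s_N(x) for x in (0.9, 0.6, 0.3, 0.1)]
print(scores)
```

With N classifiers the score takes at most N + 1 distinct values, so its empirical ROC curve is piecewise linear with at most N interior knots.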
Similar resources
Overlaying classifiers: a practical approach for optimal ranking
ROC curves are one of the most widely used displays for evaluating the performance of scoring functions. In this paper, we propose a statistical method for directly optimizing the ROC curve. The target is known to be the regression function up to an increasing transformation, and this boils down to recovering the level sets of the latter. We propose to use classifiers obtained by empirical risk minimiza...
Adaptive Estimation of the Optimal ROC Curve and a Bipartite Ranking Algorithm
In this paper, we propose an adaptive algorithm for bipartite ranking and prove its statistical performance in a stronger sense than the AUC criterion. Our procedure builds on the RankOver algorithm proposed in (Clémençon & Vayatis, 2008a). The algorithm outputs a piecewise constant scoring rule which is obtained by overlaying a finite collection of classifiers. Here, each of these classifiers ...
A Preprocessing Technique to Investigate the Stability of Multi-Objective Heuristic Ensemble Classifiers
Background and Objectives: Because heuristic algorithms are random in nature, stability analysis of heuristic ensemble classifiers is particularly important. Methods: The novelty of this paper is the use, for the first time, of a statistical method consisting of the Plackett-Burman design and the Taguchi method to identify not only the important parameters but also their optimal levels. Minitab and Design Expert ...
Fuzzy Apriori Rule Extraction Using Multi-Objective Particle Swarm Optimization: The Case of Credit Scoring
Many methods have been introduced to solve the credit scoring problem, such as support vector machines, neural networks, and rule-based classifiers. Rule bases are preferred in credit decision making because of their ability to explicitly distinguish between good and bad applicants. In this paper, multi-objective particle swarm optimization is applied to optimize a fuzzy apriori rule base in credit scoring. ...